Welcome to NextAI, your trusted platform for deploying and managing AI models efficiently. This guide will take you through the process of deploying a quantized AI model on NextAI, allowing you to benefit from models that are optimized for performance and efficiency.
Choose a quantization format for your model: FP16 balances efficiency and precision, while INT4 further reduces model size and hardware requirements at the cost of some accuracy.
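To see why the format matters, here is a back-of-envelope sketch of weight-storage requirements at different precisions. The parameter count and the helper function are illustrative assumptions, not part of the NextAI API; real deployments also carry activation and runtime overhead beyond these figures.

```python
# Rough weight-only memory footprint for a 7B-parameter model
# at common precisions (ignores activations and runtime overhead).
BITS_PER_PARAM = {"FP32": 32, "FP16": 16, "INT8": 8, "INT4": 4}

def weight_memory_gb(num_params: int, precision: str) -> float:
    """Approximate weight storage in gigabytes for a given precision."""
    bits = BITS_PER_PARAM[precision]
    return num_params * bits / 8 / 1e9  # bits -> bytes -> GB

params = 7_000_000_000  # assumed model size for illustration
for precision in BITS_PER_PARAM:
    print(f"{precision}: {weight_memory_gb(params, precision):.1f} GB")
# FP32: 28.0 GB, FP16: 14.0 GB, INT8: 7.0 GB, INT4: 3.5 GB
```

In short, INT4 stores the same model in a quarter of the memory FP16 needs, which is why it fits on smaller hardware even though precision suffers.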
Deploying a quantized model on NextAI simplifies integrating efficient, high-performance AI capabilities into your projects. By following these steps, you can take advantage of quantized models that offer reduced model size and faster inference, all within your NextAI account. If you encounter any issues or have questions, reach out to our support team for assistance.